Lexical similarity can distinguish between automatic and manual translation
نویسندگان
چکیده
We consider the problem of identifying automatic translations from manual translations of the same sentence. Using two different similarity metrics (BLEU and Levenshtein edit distance), we found out that automatic translations are closer to each other than they are to manual translations. We also use phylogenetic trees to provide a visual representation of the distances between pairs of individual sentences in a set of translations. The differences in lexical distance are statistically significant, both for Chinese to English and for Arabic to English translations.
منابع مشابه
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
متن کاملLexical similarity can distinguish between automatic and manual translations
We consider the problem of identifying automatic translations from manual translations of the same sentence. Using two different similarity metrics (BLEU and Levenshtein edit distance), we found out that automatic translations are closer to each other than they are to manual translations. We also use phylogenetic trees to provide a visual representation of the distances between pairs of individ...
متن کاملAutomatic Construction of Persian ICT WordNet using Princeton WordNet
WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...
متن کاملMachine Translation Evaluation: A Survey
This paper introduces the state-of-the-art MT evaluation survey that contains both manual and automatic evaluation methods. The traditional human evaluation criteria mainly include the intelligibility, fidelity, fluency, adequacy, comprehension, and informativeness. We classify the automatic evaluation methods into two categories, including lexical similarity and linguistic features application...
متن کاملExamining the Effect of Ideology and Idiosyncrasy on Lexical Choices in Translation Studies within the CDA Framework
Using a critical discourse analytic model of translation criticism, the present study attempts to explore the effect of ideology and idiosyncrasy on the lexical choices in translation studies. The study employed a descriptive approach to answer two research questions: Is there any relationship between ideology and idiosyncratic features of translators' lexical choices? And if yes, can it be ana...
متن کامل